Cache device and control method for controlling cache memories in a multiprocessor system

ABSTRACT

When a pre-fetch request is generated following a read request from one of the processors and the data stored in other cache devices cannot be read unless its state tag is changed, a cache controller carries out, as a fetch protocol, a weak read operation that causes the pre-fetch request to fail. Alternatively, the cache controller reads the pre-fetch data without changing the state tags of the other cache devices, sets a weak read state (W), and stores the data. The data in the weak read state (W) is invalidated by a synchronization operation of memory consistency performed by software. Furthermore, the pre-fetch data is stored in a passive preservation mode in the present cache device. Even if the pre-fetch data corresponds to a read request from some other cache device, the other cache device is not informed that the data is preserved.

TECHNICAL FIELD

The present invention relates to a cache device and a cache control method for controlling cache memories in a multiprocessor system, and in particular to a cache device and a cache control method for appropriately controlling data pre-fetched into cache memories.

BACKGROUND ART

A cache device is one effective means for shortening memory access time in a computer. A small-capacity, high-speed memory called a cache memory is added to a processor, and data accessed once is stored in that memory. Thus, when the data is next accessed, it can be returned at high speed.

FIG. 1 illustrates a cache mechanism of a multiprocessor system. Processors 102-1, 102-2, to 102-n have cache devices 100-1, 100-2, to 100-n, respectively, and these cache devices are interconnected to a main memory 108 through an interconnecting network 106. For the cache device to work effectively, it is desirable to store as much of the data required by the processor in the cache as possible. In other words, if the data required by the processor is frequently not present in the cache, then the low-speed main memory must be accessed many times. Thus, the average memory access speed is lowered. Particularly in a multiprocessor system, plural processors access the same memory; therefore, access conflicts arise, so that the average access speed to the main memory is lowered further. For this reason, storing the data required by the processor in the cache device is a very important issue in computer systems using cache devices.

In current cache devices, time locality and spatial locality of memory access are used to improve the hit ratio of the cache devices. Time locality is the concept that data accessed once is likely to be accessed again soon, and it is exploited in the cache devices by a scheme, such as LRU, in which data accessed once is not easily evicted. Spatial locality is the concept that data near data accessed once is likely to be accessed. This concept is used in the cache devices as shown by a cache line 111 of a cache memory 110 in FIG. 2. Namely, four block data, consisting of an accessed block data 116-1 and the three block data 116-2 to 116-4 that follow it, are stored in a cache array 114 following an address 112, and the data concerned is managed in units of cache blocks. Spatial locality, unlike time locality, relies on taking into the cache device, in advance, even data that is not actually required by the processor. If this method is developed further, it becomes possible to store in the cache, in advance, blocks that will shortly be required by the processor. This method is called pre-fetch. By using pre-fetch, the hit ratio of the cache device is further improved, so that the memory access time can be shortened. Pre-fetch is effective not only for a single-processor system but also for a multiprocessor system. In a multiprocessor system, however, a new problem of useless sharing arises.

In the cache system of a multiprocessor system, cache coherence is managed so that no inconsistency arises between a cache device in one processor and a cache device in another processor. For example, as shown in FIG. 3A, data stored in a plurality of cache devices 100-1 to 100-n is shared. When the processor 102-n writes to the shared data, the writing is performed after the other cache devices 100-1 and 100-2 having the same data have been informed that the writing will be performed, so that the present data in those cache devices is invalidated without fail, as shown in FIG. 3B. Through the invalidation, the other cache devices learn that the data they hold is not the newest. Cache coherency management is the scheme that ensures, in this manner, that all processors read the newest data at the time of reading. In pre-fetch, one cache device predicts data that will be requested before long and reads that data together with the data required by the processor. However, this prediction does not necessarily prove to be right. Thus, useless data may be read. Even in a single-processor system, useless reading by pre-fetch causes problems, such as useless traffic between the main memory and a cache. In a multiprocessor system, not only useless traffic but also useless sharing arises. In other words, data that would not be shared under methods that read only required data may be shared by plural caches under methods that use pre-fetch. When writing to shared data in a cache device, the cache device must inform the other cache devices that the writing will be performed. Until this processing of informing the other cache devices is finished, the data cannot be renewed. Therefore, writing to shared data is heavy processing, that is, processing that requires much time in the cache device.

As a result, with pre-fetch in any multiprocessor system, the drawbacks of useless traffic and useless sharing cancel the advantage of pre-fetch. Thus, the multiprocessor system does not exhibit superior performance. As described above, in conventional cache devices, pre-fetch, which improves the cache hit ratio, also increases overhead at the time of writing due to useless sharing and increases data transmission due to the pre-fetch itself. As a result, pre-fetch is not easily applied to multiprocessor systems.

SUMMARY OF THE INVENTION

An object of the present invention is to provide a cache memory device and a cache memory device control method that use pre-fetch effectively in the cache devices of a multiprocessor system to improve the cache hit ratio without increasing overhead or data transmission.

[First Weak Read]

The present invention is a cache device set up in each of the processors, interconnected to other cache devices in other processors and connected to a main memory. The device comprises a cache memory and a cache controller. In the cache memory, a part of the data in the main memory is stored in one or more cache lines, and a state tag used to manage data consistency is set up in each of the cache lines. When a pre-fetch request is generated following a read request from one of the processors and data stored (i.e., preserved) in the other cache devices cannot be read unless its state tag is changed, the cache controller carries out, as a fetch protocol, a weak read operation (first weak read) that causes the pre-fetch request to fail. In conventional cache devices, data must be read without fail if a read request is generated. Therefore, heavy processing for managing consistency, that is, changing the states of other caches and subsequently reading the data, is necessary. However, the cache device of the present invention is provided with a pre-fetch protocol. Since pre-fetch is a speculative memory access whose result may not be used afterwards, it is unnecessary to guarantee that the reading based on it succeeds. For this reason, the above-mentioned pre-fetch protocol is defined so that the reading fails when it would change the states of the other cache devices. Thus, a weak read operation, which may end in failure, is realized. This pre-fetch protocol excludes any pre-fetch that would change the states of the other cache devices, and thereby reduces the overhead accompanying the pre-fetch read request and writing to the corresponding data.

The cache memory distinguishes the stored data by four states, that is, a data-modified state (M), an exclusive state (E), a data-shared state (S), and an invalid state (I), each of which is indicated by the state tag (an MESI protocol). On the basis of the present pre-fetch protocol, the cache controller causes the pre-fetch request generated following a read request from one of the processors to fail when the data corresponding to the pre-fetch request is stored in the other cache devices in the data-modified state (M) or the exclusive state (E). By using this pre-fetch protocol, it is possible to prevent the states of the cache devices holding the data in the exclusive state (E) or the data-modified state (M) from being changed, and to reduce the overhead accompanying pre-fetch from those cache devices.

When the data corresponding to the pre-fetch request is in the invalid state (I) in the other cache devices, the cache controller reads the data from the main memory and stores it in the cache memory in the exclusive state (E); and when the data is in the data-shared state (S), the cache controller reads the data from the other cache devices and stores it in the cache memory in the data-shared state (S). The protocol in this case follows the normal MESI protocol.
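The decision made by the first weak read under the MESI protocol can be sketched as follows. This is only an illustrative sketch in Python, not the implementation of the invention; the names `State` and `first_weak_read`, and the representation of the snoop result as a list of states, are assumptions introduced for illustration.

```python
from enum import Enum


class State(Enum):
    M = "modified"
    E = "exclusive"
    S = "shared"
    I = "invalid"


def first_weak_read(other_states):
    """Decide the outcome of a pre-fetch request under the first weak read.

    other_states: states in which the other cache devices hold the block
    (hypothetical representation of the snoop result).
    Returns (succeeded, source, stored_state).
    """
    if any(s in (State.M, State.E) for s in other_states):
        # Reading would require changing another device's state tag,
        # so the pre-fetch request is made to fail.
        return (False, None, None)
    if any(s is State.S for s in other_states):
        # Shared elsewhere: read from another cache device, store as S.
        return (True, "other_cache", State.S)
    # Invalid everywhere: read from the main memory, store as E.
    return (True, "main_memory", State.E)
```

For example, `first_weak_read([State.I, State.M])` reports failure, whereas `first_weak_read([State.S, State.I])` succeeds and stores the block in the data-shared state (S).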

[Second Weak Read]

In the cache device of another embodiment of the present invention, when a pre-fetch request is generated following a read request from one of the processors and the data stored in the other cache devices cannot be read without changing its state tag, a cache controller reads the data without changing the state tag, responds to the processor, and stores the data in the cache memory with a weak state (W) set. Thereafter, at the time of a synchronization operation of memory consistency for attaining data consistency by software, the cache controller carries out a pre-fetch protocol in which all data in the cache memory in the weak state (W) is invalidated. The read operation that stores the pre-fetch data in this weak state is called a second weak read.

This second weak read relies on the fact that, in a memory consistency model that keeps weak consistency by software through synchronization operations, the order of memory operations between synchronization operations is arbitrary. When this memory consistency model is used to read pre-fetch data that would require a change in the states of the other cache devices, data whose reading fails under the first weak read is read by the second weak read, so that the pre-fetch ends normally. Afterwards, when a synchronization point is reached, a synchronization message is delivered to the respective cache devices. As a result, the respective cache devices search for data in the weak state (W) and invalidate all of it. In this way, reordering across the synchronization point is prevented. Thus, the second weak read satisfies the requirements of the memory consistency model.

Taking the MESI protocol as an example, when data that corresponds to the pre-fetch request is stored in the other cache devices in the data-modified state (M) or the exclusive state (E), the cache controller reads the data without changing its state tag, sets the weak state (W), and stores the data in the cache memory. At the time of a synchronization operation of memory consistency, all data in the weak state (W) is changed to the invalid state (I).

Therefore, even if the data is pre-fetched in the weak state (W) and a processor subsequently renews the data that another cache device holds in the exclusive state (E) or the data-modified state (M), the renewal is not conveyed to the cache device holding the data in the weak state (W). This can be regarded, however, as a change in the order of data renewal between the cache device in which the data is stored in the weak state (W) and the cache device in which the data is stored in the data-modified state (M) or the exclusive state (E). This satisfies the memory consistency model. Afterwards, when a synchronization point is reached, a synchronization message is delivered to the respective cache devices. As a result, the respective cache devices search for data in the weak state (W) and invalidate all of it. In this way, reordering across the synchronization point is prevented. Thus, the second weak read satisfies the requirements of the memory consistency model.

When the data corresponding to the pre-fetch request is in the invalid state (I) in the other cache devices, the cache controller reads the data from the main memory and stores it in the cache memory in the exclusive state (E); and when the data is in the data-shared state (S), the cache controller reads the data from the other cache devices and stores it in the cache memory in the data-shared state (S). This case follows the MESI protocol.
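The second weak read and the invalidation at a synchronization point can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the class `WeakCache`, its methods, and the dictionary used as a stand-in for the cache memory are hypothetical.

```python
from enum import Enum


class State(Enum):
    M = "M"
    E = "E"
    S = "S"
    I = "I"
    W = "W"  # weak state set by the second weak read


class WeakCache:
    def __init__(self):
        self.lines = {}  # address -> (State, data); stand-in for the cache memory

    def prefetch(self, addr, other_state, data):
        """Store pre-fetched data; other_state is the state held elsewhere."""
        if other_state in (State.M, State.E):
            # Read without changing the other device's state tag.
            new_state = State.W
        elif other_state is State.S:
            new_state = State.S
        else:
            # Invalid elsewhere: data comes from the main memory.
            new_state = State.E
        self.lines[addr] = (new_state, data)
        return new_state

    def synchronize(self):
        """On a synchronization message, invalidate every line in state W."""
        for addr, (state, _data) in list(self.lines.items()):
            if state is State.W:
                self.lines[addr] = (State.I, None)
```

In this sketch, a block pre-fetched while another device holds it in M or E remains readable in state W until `synchronize()` is called, after which it is invalid (I); blocks stored in S or E are unaffected.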

[Passive Preservation Mode]

In the cache device of a further embodiment of the present invention, when a pre-fetch request is generated following a read request from one of the processors, a cache controller sets a passive preservation mode (P) for data pre-fetched from the other cache devices or the main memory and then stores the data in the cache memory;

I. when the data corresponding to a read request from another cache device is pre-fetch data for which the passive preservation mode (P) is set, the other cache device is not informed that the corresponding data is preserved;

II. when none of the other cache devices stores the corresponding data, the pre-fetch data is invalidated; and

III. when the other cache devices share the corresponding data, the pre-fetch data is stored as it is.

The first and second weak read operations apply to the case in which, at the time of a pre-fetch request, a read request is made to other cache devices. Conversely, in a cache device in a multiprocessor system, the state of pre-fetch data stored in its cache memory may be changed by a read from some other cache device. Thus, the passive preservation mode (P) is set, as a marker indicating passive preservation that gives the other cache devices priority, for the pre-fetch data read from the main memory or the other cache devices, so that the data is not supplied even if a read request from some other cache device is recognized. Thus, the other cache device is prevented from accessing the pre-fetch data. For this reason, for pre-fetch data that will probably not actually be used but is stored as a precaution, transitions to a useless sharing state caused by state changes are reduced. Overhead at the time of writing to data read into the cache device is also reduced.

When the passive preservation mode and the weak reading overlap with each other, they can coexist by regarding data stored in the passive preservation mode as not exclusive (E). The cache device in which the data is stored in the passive preservation mode behaves as if it did not hold the data, whether or not its own copy is exclusive (E), and waits for information on the states of the other cache devices. Thus, transition to a useless sharing state can be avoided. In a cache device using, for example, the MESI protocol, when the data corresponding to a read request from some other cache device is pre-fetch data for which the passive preservation mode (P) is set, and all of the other cache devices are invalid (I) or any one of them is in the data-modified state (M) or the exclusive state (E), the cache controller changes the pre-fetch data stored in the passive preservation mode (P) to the invalid state (I) so that the data is not supplied. When the other cache devices are in the data-shared state (S), the pre-fetch data stored in the passive preservation mode (P) is kept as it is.
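The handling of a read request from another cache device that hits pre-fetch data held in the passive preservation mode (P) can be sketched as follows. The function name `snoop_passive_line` and the use of single-letter strings for the states are assumptions made only for this illustration.

```python
def snoop_passive_line(other_states):
    """Decide what a cache device holding a block in passive mode (P) does
    when some other cache device issues a read request for that block.

    other_states: states ("M", "E", "S" or "I") of the block in the
    remaining cache devices (hypothetical representation).
    Returns (report_holding, keep_line).
    """
    # The holder never informs the requester that it preserves the data.
    report_holding = False
    # The line is kept only when the block is shared (S) elsewhere; if all
    # others are invalid (I), or one holds it in M or E, it is invalidated.
    keep_line = any(s == "S" for s in other_states)
    return report_holding, keep_line
```

Because the holder never reports its copy, the requester sees the same snoop result it would have seen had the pre-fetch never taken place, which is how the transition to a useless sharing state is avoided.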

A normal preservation mode (N) is set for data stored in the cache memory other than the pre-fetch data in the passive preservation mode (P). Data preservation in the passive preservation mode (P) and data preservation in the normal preservation mode (N) are carried out on a per-cache-line basis, and the two modes coexist.

It is difficult to apply the pre-fetch protocol of the present invention to normal data. Specifically, a normal data read request is generated when the processor actually needs the data. If the reading fails and the data is not obtained, the operating performance of the processor drops. Therefore, the first weak read, the second weak read, and the passive preservation mode (P) of the present invention are applied only to reading of speculative data such as pre-fetch. In other words, normal data is stored in the cache device in the normal preservation mode (N). Pre-fetch data, which is passive preservation data, is stored in the passive preservation mode (P). The two are distinguished in the cache memory. In this manner, a cache device in which a preservation mode is set for each cache line can be realized.

When the cache controller receives a read request from the processor, the cache controller issues, following the read request, a pre-fetch request for one or more addresses ADR+n adjacent to the read-requested address ADR. The pre-fetch request accompanying the read request from the processor is usually sent to the respective cache devices in a separate command cycle. In the present invention, however, the pre-fetch request is embedded in the normal read request. For example, when pre-fetch of adjacent blocks is adopted and a read request for the address ADR is sent to the cache device, this request is also handled as a read request for the adjacent block ADR+1. The processor can therefore obtain the pre-fetch data without generating any read request for the address ADR+n for the pre-fetch.

Of course, any one of the first weak read, the second weak read, and the passive preservation mode (P), or any combination thereof, is used to perform the pre-fetch request at this time. In other words, requests using different protocols for different addresses are put together into one request. Putting requests to the cache device together in this way reduces the overhead caused by excessive requests.
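The embedding of a pre-fetch request in a normal read request can be sketched as follows. This is only an illustration: the `Request` record, the function `expand_read_request`, the protocol labels, and the assumption that addresses are counted in units of blocks are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    address: int   # block address (assumed to be counted in blocks)
    kind: str      # "read" or "prefetch"
    protocol: str  # "normal", "weak1" or "weak2" (illustrative labels)


def expand_read_request(adr, n=1, prefetch_protocol="weak1"):
    """Interpret one read request for ADR as a read of ADR plus
    pre-fetch requests for the adjacent blocks ADR+1 .. ADR+n."""
    requests = [Request(adr, "read", "normal")]
    for k in range(1, n + 1):
        requests.append(Request(adr + k, "prefetch", prefetch_protocol))
    return requests
```

A single bus request for `adr` is thus treated by every receiving cache device as the list returned by `expand_read_request(adr)`, so no separate command cycle is spent on the pre-fetch.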

When the cache controller of the cache device receives the read request from the processor, it carries out, after the read request, a pre-fetch request for one or more addresses adjacent to the read-requested address. Specifically, as the interconnecting network for connecting the cache devices and the main memory, a snoop bus is used, in which the cache controller outputs the preservation states of the corresponding data onto state controlling lines when it receives a read request from its own processor or some other cache device. The state controlling lines consist of a state controlling line dedicated to the read request and a state controlling line dedicated to the pre-fetch request, the read request and the pre-fetch request being carried out at the same time. The states of the respective cache devices for the address of the requested data and for the address of the pre-fetch requested data are conveyed at the same time.

One of the problems arising when pre-fetch is performed in a cache device in a multiprocessor system is detection of the states of the other cache devices. For a normal read request, it is sufficient to detect the state of the cache devices only for the requested address. However, if pre-fetch is introduced, the state of the cache devices must be detected for the pre-fetched address as well as the requested address. Thus, a state signal line corresponding to the read request and a state signal line corresponding to the pre-fetch request are provided in the common bus so that the results of the state detection are conveyed simultaneously. Thus, overhead at the time of the request is reduced.

In the cache controller of the cache device, in reply to the simultaneous read request and pre-fetch request using the dedicated state signal lines, a distinguishing bit for distinguishing data sent in response to the read request from data sent in response to the pre-fetch request is provided in a response header, and the data whose distinguishing bits are valid are transmitted together. In this manner, overhead at the time of the reply is reduced.

Because the read request and the pre-fetch read request are issued simultaneously, the replies to the two requests can be put together. At this time, the weak pre-fetch protocol is used for the pre-fetch read request; therefore, the reading may fail. When the pre-fetch fails, its data section in the reply is cut off and the remainder is sent to the request source. By checking which distinguishing bits are valid, the request source can tell whether the sent data is a reply to the read request, a reply to the pre-fetch request, or a reply to both. Efficient data transmission can be realized by putting the replies to plural requests together into one reply and cutting off any section in which no data is present.
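The combined reply with distinguishing bits can be sketched as follows. This is a minimal illustration and does not reproduce the actual reply format of the invention; the header field names `read_valid` and `prefetch_valid` and the two helper functions are assumptions.

```python
def build_reply(read_data, prefetch_data):
    """Combine the replies to a read request and a pre-fetch request.

    A value of None means that no data is returned for that request
    (for example, the weak pre-fetch failed); its data section is cut off.
    """
    header = {
        "read_valid": read_data is not None,
        "prefetch_valid": prefetch_data is not None,
    }
    body = [d for d in (read_data, prefetch_data) if d is not None]
    return header, body


def parse_reply(header, body):
    """Recover (read_data, prefetch_data) using the distinguishing bits."""
    sections = iter(body)
    read_data = next(sections) if header["read_valid"] else None
    prefetch_data = next(sections) if header["prefetch_valid"] else None
    return read_data, prefetch_data
```

When the pre-fetch fails, `build_reply(b"A", None)` produces a body with a single data section, and the request source recovers `(b"A", None)` from the header bits alone.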

The present invention also provides a method for controlling a cache system wherein cache devices set up in respective processors are mutually connected through an interconnecting network and are connected to a main memory.

The control method for controlling the cache system, wherein the first weak read is performed, comprises the steps of:

I. storing a part of the data in the main memory in one or more cache lines of a cache memory and setting up a state tag used to manage data consistency in each of the cache lines, and

II. carrying out, as a fetch protocol, when a pre-fetch request is generated following a read request from one of the processors and the data stored in the other cache devices cannot be read unless its state tag is changed, a weak read operation that causes the pre-fetch request to fail.

The control method for controlling the cache system, wherein the second weak read is performed, comprises the steps of:

I. storing a part of the data in the main memory in one or more cache lines of a cache memory and setting up a state tag used to manage data consistency in each of the cache lines,

II. reading, when a pre-fetch request is generated following a read request from one of the processors and the data stored in the other cache devices cannot be read without changing its state tag, the data without changing the state tag to respond to the processor, and subsequently storing the data in the cache memory with a weak state (W) set, and

III. invalidating, at the time of a synchronization operation of memory consistency for attaining data consistency by software, all data in the cache memory in the weak state (W).

The control method for controlling the cache system, wherein a passive preservation mode (P) is adopted, comprises the steps of:

I. storing a part of the data in the main memory in one or more cache lines of a cache memory and setting up a state tag used to manage data consistency in each of the cache lines,

II. setting, when a pre-fetch request is generated following a read request from one of the processors, the passive preservation mode (P) for data pre-fetched from the other cache devices or the main memory and storing the data in the cache memory,

III. not informing, when data corresponding to a read request from another cache device is pre-fetch data for which the passive preservation mode (P) is set, the other cache device that the corresponding data is preserved, and

IV. invalidating the pre-fetch data when none of the other cache devices stores the corresponding data, and storing the pre-fetch data as it is when the corresponding data is shared by the other cache devices.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a conventional cache system relating to a multiprocessor system;

FIG. 2 is an explanatory view of a cache memory;

FIGS. 3A and 3B are explanatory views of a read request, which is followed by a change in states of other cache devices to keep cache coherency;

FIG. 4 is a block diagram of a multiprocessor system using cache devices of the present invention;

FIG. 5 is a block diagram of a circuit wherein a common bus for connecting the cache devices shown in FIG. 4 is made a snoop bus;

FIGS. 6A and 6B are diagrams of state transition of an MESI protocol;

FIG. 7 is a block diagram of a cache device of the present invention;

FIG. 8 is an explanatory view of a cache line in a cache memory shown in FIG. 7;

FIG. 9 is an explanatory view of correspondence of tag bits shown in FIG. 7 and the states of stored data;

FIG. 10 is a flowchart of pre-fetch processing by first weak read;

FIG. 11 is an explanatory view of a tag bit having a weak state (W) used in second weak read;

FIG. 12 is a flowchart of pre-fetch processing by the second weak read;

FIG. 13 is an explanatory view of a cache line for setting a passive preservation mode (P);

FIG. 14 is a flowchart of pre-fetch processing to the data preserved in the passive preservation mode (P);

FIG. 15 is a circuit block diagram of a snoop bus making it possible to carry out state detection by a read request and state detection by a pre-fetch request at the same time;

FIG. 16 is an explanatory view of a request format wherein a pre-fetch request is embedded in a read request;

FIG. 17 is an explanatory view of a reply format to the request format shown in FIG. 16; and

FIG. 18 is an explanatory view of correspondence of the reply shown in FIG. 17 to the request shown in FIG. 16.

DETAILED DESCRIPTION OF THE INVENTION

[System Structure]

FIG. 4 is a block diagram of a multiprocessor system to which the cache devices of the present invention are applied. The multiprocessor system has processors 12-1, 12-2, to 12-n, and the processors 12-1 to 12-n are provided with cache devices 10-1, 10-2, to 10-n of the present invention, respectively. The processors 12-1 to 12-n are interconnected through the cache devices 10-1 to 10-n to a common bus 14 serving as an interconnecting network. A single main memory 18 is connected to the common bus 14 through a memory controller 16.

FIG. 5 illustrates details of the connection between the common bus 14 and a cache device in the multiprocessor system shown in FIG. 4. The common bus 14 is provided with an EX line 20, a HIT line 22, and a HITM line 24 as state signal lines, onto which the results obtained by examining the state of the cache devices 10 are output. The common bus 14 is also provided with an address bus 26, a data bus 28 and a command bus 29. A snoop bus is used as this common bus 14. The cache device 10 is provided with a cache controller 30 and a cache memory 32. The EX line 20, the HIT line 22 and the HITM line 24 of the common bus 14 are shared by the cache controllers 30 of the respective cache devices. These lines are defined so as to be asserted at a specific timing relative to a memory transaction. Namely, when a certain memory transaction is generated from the cache controller 30, a request message is sent out to the common bus, and the cache controller of each of the other cache devices examines whether the data corresponding to the request address of the transaction is present in its cache memory 32 and, if present, what state the data is in. The obtained result is conveyed through the EX line 20, the HIT line 22 or the HITM line 24, depending on the examined state. As a protocol for keeping consistency of data in the cache devices of the present invention, an MESI protocol is adopted.

FIG. 6A illustrates state transition of the MESI protocol in response to a read request, and FIG. 6B illustrates state transition of the MESI protocol in response to a writing request. Symbols in the state transition in FIGS. 6A and 6B represent the following:

M: a data-modified state, in which renewed data is preserved in only one of the plural caches;

E: an exclusive state, in which valid data that is not renewed is preserved in only one of the plural caches;

S: a data-shared state, in which the same data is preserved in the plural caches;

I: an invalid state, in which data on the cache is invalid;

self: a request from the self-processor is processed;

other: a request from some other cache device is processed;

self-if-copy: the state for invalidating a read request, in which valid data is copied from some other cache; and

self-if-no-copy: the state for invalidating a read request, in which valid data is copied from the main memory.

When, on the common bus 14 shown in FIG. 5, the data corresponding to the request address of a memory transaction is present in the cache and the cache controller 30 has examined the state of the data, the cache controller 30 asserts the EX line 20 on the basis of this MESI protocol if the state is exclusive (E). The controller 30 asserts the HIT line 22 if the state is shared (S). The controller 30 asserts the HITM line 24 if the state is modified (M). In this manner, the cache controller 30 of each of the cache devices can judge which of the following states I, II, III, or IV applies to the data corresponding to the request message, based on the memory transaction, currently sent onto the common bus 14:

I: the data is exclusive (E) and is preserved in one of the caches;

II: the data is shared (S) and is preserved in the plural cache devices;

III: the data is modified (M) and is preserved in one of the caches; and

IV: the data is present in no caches and is invalid (I).
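The judgment of cases I to IV from the EX, HIT and HITM lines can be sketched as follows. This is an illustrative sketch only; the function names are hypothetical, and combining the lines of all snooping cache devices by a logical OR is an assumption not stated in the description.

```python
def asserted_lines(state):
    """Lines one cache device asserts for a block in the given MESI state."""
    return {"EX": state == "E", "HIT": state == "S", "HITM": state == "M"}


def classify(states):
    """Classify a snoop result into cases I to IV.

    states: MESI states ("M", "E", "S" or "I") of the requested block in
    each snooping cache device. The shared lines are modelled as the
    logical OR of what every device asserts (an assumption).
    Returns (case, state).
    """
    lines = [asserted_lines(s) for s in states]
    if any(l["EX"] for l in lines):
        return ("I", "E")
    if any(l["HIT"] for l in lines):
        return ("II", "S")
    if any(l["HITM"] for l in lines):
        return ("III", "M")
    # No line asserted: the data is present in no cache.
    return ("IV", "I")
```

For instance, `classify(["I", "E", "I"])` yields case I, and a bus on which no line is asserted yields case IV.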

FIG. 7 is a block diagram illustrating, in detail, functions of the cache controller 30 and the cache memory 32 in the cache device illustrated in FIG. 5. The cache device 10 is mainly composed of the cache controller 30 and the cache memory 32. The cache controller 30 manages the operation of the cache memory 32, and receives a data request from the processor 12 or issues a request message to the common bus 14. The cache controller 30 also constantly monitors traffic on the common bus 14. If a request message from some other cache device is sent onto the bus, the cache controller 30 performs the cache operation corresponding to this request message. The cache controller 30 is provided with a cache control managing unit 34, a processor interface 36, a bus interface 38, a read protocol processing unit 40, a pre-fetch protocol processing unit 42, and a writing protocol processing unit 44. The read protocol processing unit 40 carries out processing for cache coherency management, following normal read operation, in accordance with the MESI read protocol shown in FIG. 6A. The writing protocol processing unit 44 carries out processing for keeping data consistency, in accordance with the MESI writing protocol shown in FIG. 6B.

The pre-fetch protocol processing unit 42, which is newly set up in the present invention, is composed of a first weak protocol read processing unit 45, a second weak protocol read processing unit 46 and a passive reading mode processing unit 48. In the case that, at the time of generating a pre-fetch request accompanying a memory access request of the processor 12, data stored in some other cache device cannot be read unless the state thereof is changed, the first weak protocol read processing unit 45 carries out a weak reading operation that causes the pre-fetch request to fail. In the same case, the second weak protocol read processing unit 46 forcibly reads the data without changing the state, sets up a weak state (W) indicating weak protocol read, and stores the data in the cache memory in this state. The second weak protocol read processing unit 46 is combined with a memory consistency model in which consistency of data is attained by software on the processor 12 side. When the unit 46 receives a synchronization message issued to attain the consistency of the data, it invalidates all data in the cache memory that are in the weak state (W).

While the first weak protocol read processing unit 45 and the second weak protocol read processing unit 46 carry out a weak reading operation when data is pre-fetched from some other cache device, the passive reading mode processing unit 48 handles the case that pre-fetch data is stored at the address corresponding to a memory access request by some other cache device. Specifically, the mode (P) representing the passive preservation mode is set and preserved at the time of storing pre-fetch data in the cache memory 32. In the case that data corresponding to a read request by some other cache device is pre-fetch data for which the passive preservation mode (P) is set, the passive reading mode processing unit 48 does not inform the other cache device that it preserves this pre-fetch data, so that the other cache device does not use the pre-fetch data. In the case that data in all of the other cache devices are invalid (I), or data is preserved in the exclusive state (E) or data-modified state (M), the pre-fetch data for which the passive preservation mode (P) is set is made invalid (I). In the case that the corresponding data is shared (S) by the other devices, the pre-fetch data is preserved as it is, that is, without changing its state.

For the first weak protocol read processing unit 45, the second weak protocol read processing unit 46 and the passive reading mode processing unit 48, which are set up in the pre-fetch protocol processing unit 42 of the cache controller 30, any one of the following combinations is selectively set up by setup processing at the time of starting the device, or by similar processing:

I: a combination of the first weak protocol read processing unit 45 and the passive reading mode processing unit 48; and

II: a combination of the second weak protocol read processing unit 46 and the passive reading mode processing unit 48.

Of course, it is allowable to fixedly set up, as the function of the cache controller 30, either one of the first and second weak protocol read processing units 45 and 46, together with the function of the passive reading mode processing unit 48.

The cache memory 32 is a place for storing a copy of a part of the data in the main memory 18, and a cache line 50 is composed of a tag 54, a key 56, and a cache array 58. The cache array 58 is a place for storing data. A copy of data in the main memory 18 and data renewed by the processor 12 are stored in the cache array 58. The cache arrays 58 are managed in block units. One block is composed of plural data that the processor 12 processes. The tag 54 and the key 56 represent the state of each of the blocks in the cache arrays 58, and correspond to each of the cache arrays 58. As the cache memory 32, there is generally used a set associative cache, which can hold plural data corresponding to a certain index (the nth entry). To make the description simple in the present embodiment, a direct map cache, wherein one address corresponds to one index, is used. The key 56 indicates the memory address from which the data in the cache array 58 is copied. The key combined with the index defines only one address. In the case of the direct map cache, the key directly represents the address since the index and the address have a one-to-one correspondence. The tag 54 represents the state of the data block stored in the cache array 58. As shown in FIG. 8, the tag 54 is composed of 2 bits b1 and b2. The states of data preserved under the MESI protocol shown in FIGS. 6A and 6B are represented by the 2-bit data b1 and b2 in the tag 54 as shown in FIG. 9.
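The cache line structure described above can be illustrated by a minimal sketch. It is not the circuit of the embodiment: the class names, the block-based split of an address into index and key, and the symbolic state values are assumptions made for illustration, and the actual bit encoding of b1 and b2 in FIG. 9 is deliberately not reproduced.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class State(Enum):
    """MESI states kept in the tag 54 (symbolic; the FIG. 9 bit encoding is not modelled)."""
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


@dataclass
class CacheLine:
    tag: State = State.INVALID                      # state of the block (tag 54)
    key: Optional[int] = None                       # identifies the source address (key 56)
    data: List[int] = field(default_factory=list)   # block contents (cache array 58)


class DirectMappedCache:
    """Hypothetical direct map cache: each address maps to exactly one index."""

    def __init__(self, num_lines: int, block_size: int) -> None:
        self.num_lines = num_lines
        self.block_size = block_size
        self.lines = [CacheLine() for _ in range(num_lines)]

    def split(self, address: int) -> Tuple[int, int]:
        # Assumed layout: block number modulo line count gives the index,
        # the remaining upper part is the key. Key plus index name one block.
        block = address // self.block_size
        return block % self.num_lines, block // self.num_lines

    def lookup(self, address: int) -> Optional[CacheLine]:
        index, key = self.split(address)
        line = self.lines[index]
        if line.tag is not State.INVALID and line.key == key:
            return line        # hit
        return None            # miss

    def fill(self, address: int, data: List[int], state: State) -> None:
        index, key = self.split(address)
        self.lines[index] = CacheLine(tag=state, key=key, data=list(data))
```

In this sketch two addresses inside the same block hit the same line, while an address that maps to the same index with a different key is reported as a miss.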

[First Weak Protocol Read]

As a first embodiment of pre-fetch protocol processing in the cache devices 10 of the present invention, the following will describe pre-fetch processing by the first weak protocol read processing unit 45 set up in the cache controller 30. Let us suppose the case that, in the multiprocessor shown in FIG. 4, a memory access request is issued from the processor 12-1. A data request, which follows the memory access request from the processor 12-1, is sent as a request message from the cache device 10-1 to the common bus 14. Additionally, in the present invention, pre-fetch by hardware is performed in the cache device 10-1. To make the description simple in the present embodiment, for each data-requested address ADR from the processor 12-1, data in addresses (ADR+n) are pre-fetched, wherein n is a block size that is arbitrarily selected. Thus, an adjacent block or adjacent blocks are pre-fetched. The pre-fetch caused by the hardware, which accompanies the data request from the processor 12-1, is sent out as a request message of the weak read protocol to the common bus 14 by processing of the first weak protocol read processing unit 45 shown in FIG. 7. In other words, for normal read based on the data-requested address ADR from the processor 12, the cache controller 30 sends out a request message for the normal read to the common bus 14; after the reply message in response to this request message is finished, the cache controller 30 uses the first weak protocol read processing unit 45 to send out, to the common bus 14, a request message for read based on the weak protocol for pre-fetching the required addresses ADR+n. In this manner, a request message for normal read is distinguished from a request message for the weak protocol read.

When a request message for read based on the weak protocol is issued from the cache controller 30, the other cache devices start snoop operation in the same manner as for normal read. Specifically, each of the other cache devices examines the required addresses ADR+n obtained from the request message for read based on the weak protocol, and examines whether the data corresponding to the required addresses ADR+n are present in its own cache memory and, if present, what state the data are in. In the case that the corresponding data are present in the cache memory and the data are exclusive (E), the cache controller asserts the EX line 20. When the data are shared (S), the cache controller asserts the HIT line 22. When the data are modified (M), the cache controller asserts the HITM line 24. When the data are invalid (I), the cache controller does not assert any state controlling line. The cache controller 30 of the pre-fetch request source and the memory controller 16 for the main memory 18 monitor the common bus (snoop bus) 14 so that, depending on the state of the other cache devices, processing is performed as follows.

I. In the case that none of the EX line, the HIT line and the HITM line is asserted, that is, the data are invalid (I) in the other cache devices, the pre-fetch request is regarded as a success. As a result, the data read from the required addresses ADR+n are transmitted from the main memory 18 to the cache device of the request source. The cache controller 30 of the request source stores the data transmitted from the main memory 18, in the exclusive state (E), in the cache array.

II. In the case that only the HIT line is asserted (i.e., the data-shared state (S)), the cache controller 30 of the request source and the memory controller 16 for the main memory 18 regard the pre-fetch request as a success. As a result, the data in the required addresses ADR+n are read from the main memory 18 and transmitted to the cache device of the request source. The cache controller 30 of the request source stores the transmitted data, in the data-shared state (S), in the cache array.

III. In the case of the exclusive state (E), wherein the EX line is asserted, or the data-modified state (M), wherein the HITM line is asserted, that is, in the case that the state of the other cache devices would have to be modified under the normal read protocol, the cache controller 30 of the request source and the memory controller 16 for the main memory 18 regard the pre-fetch request as a failure, and interrupt the processing.
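The snoop responses that feed cases I to III can be sketched as follows. This is only an illustration: the function names are hypothetical, and modelling each state controlling line as a logical OR over all snooping cache devices is an assumption about how the shared lines combine, which the text does not spell out.

```python
from enum import Enum
from typing import Dict, Iterable


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def snoop_response(state: State) -> Dict[str, bool]:
    """Lines one snooping cache device asserts for the state of its own copy."""
    return {
        "EX": state is State.EXCLUSIVE,    # EX line 20
        "HIT": state is State.SHARED,      # HIT line 22
        "HITM": state is State.MODIFIED,   # HITM line 24
    }                                      # invalid (I): nothing asserted


def bus_lines(states: Iterable[State]) -> Dict[str, bool]:
    """Combine the responses of all other cache devices (assumed OR per line)."""
    lines = {"EX": False, "HIT": False, "HITM": False}
    for state in states:
        for name, asserted in snoop_response(state).items():
            lines[name] = lines[name] or asserted
    return lines
```

With two other cache devices holding the block as invalid and one holding it as shared, only the HIT line is observed as asserted, which corresponds to case II.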

FIG. 10 is a flowchart of pre-fetch processing by the first weak protocol read processing unit shown in FIG. 7. In the case that pre-fetch of data in the requested addresses ADR+n is requested by the hardware, accompanying a memory request from the processor 12, in step S1 the processor 12 first searches its own cache memory 32 and then in step S2 the processor 12 judges whether or not the corresponding data are present in the cache and are hit. When the data are hit, in step S9 the read data are given back to the processor and the series of processing is finished. When the corresponding data are not present in the cache and are not hit in step S2, in step S3 a request message for read based on the weak protocol is sent onto the common bus 14. In response to the request message for read based on the weak protocol, each of the other cache devices examines whether the data corresponding to the requested addresses ADR+n are stored and, if stored, the state of the data. As a result, the corresponding state controlling line is asserted. In step S4, it is first checked whether the state is the exclusive state (E) based on assertion of the EX line or the data-modified state (M) based on assertion of the HITM line. If the state controlling line corresponding to the exclusive state (E) or the data-modified state (M) is asserted, the processing goes to step S5, so that the reading of the pre-fetch-requested data is regarded as a failure and the processing is finished. If neither the EX line nor the HITM line is asserted in step S4, the processing goes to step S6, where it is checked whether or not the present state is the data-shared state (S) based on assertion of the HIT line. If the HIT line is asserted, the present state is the data-shared state (S). Therefore, the processing goes to step S7, so that data in the requested addresses ADR+n are read and transmitted from the main memory 18 and then stored in the cache array in the data-shared state (S). If the HIT line is not asserted in step S6, the present state is the invalid state (I). Thus, the processing advances to step S8, so that data in the requested addresses ADR+n are read and transmitted from the main memory 18 in the same manner and then stored in the exclusive state (E) in the cache array.

As described above, in the case that it would be necessary to modify the state of the other cache devices by pre-fetch, that is, in the case that the data corresponding to the pre-fetch are stored in the exclusive state (E) or the data-modified state (M) in the other cache devices, the pre-fetch request is made as a read operation based on the weak protocol, and the reading of the pre-fetch-requested data is regarded as a failure to interrupt the processing. Thus, when the processor accesses the data held in the exclusive state (E) or the data-modified state (M) in some other cache device, an operation for changing the data state to the data-shared state (S) in the cache devices in which the same data is stored as pre-fetch data becomes unnecessary. It is also possible to reduce overhead at the time of writing by the processor in the cache device in which data is stored in the exclusive state (E) or the data-modified state (M).
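The flow of FIG. 10 as described above can be sketched in a few lines. The sketch is illustrative only: the dictionary used as a cache, the `lines` argument standing for the observed state controlling lines, and the `read_memory` callback are hypothetical stand-ins, not elements of the embodiment.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical cache model: address -> (state, data), state being "M", "E", "S" or "I".
Cache = Dict[int, Tuple[str, bytes]]


def first_weak_prefetch(
    cache: Cache,
    address: int,
    lines: Dict[str, bool],
    read_memory: Callable[[int], bytes],
) -> Optional[bytes]:
    """Pre-fetch one block with the first weak protocol read; None means failure."""
    # S1/S2: search the requester's own cache first.
    if address in cache and cache[address][0] != "I":
        return cache[address][1]                 # S9: hit, data returned
    # S3: the weak read request has been sent; `lines` is the snoop result.
    if lines["EX"] or lines["HITM"]:             # S4: held as E or M elsewhere
        return None                              # S5: failure, nothing is stored
    data = read_memory(address)                  # data come from the main memory
    state = "S" if lines["HIT"] else "E"         # S6: then S7 (shared) or S8 (exclusive)
    cache[address] = (state, data)
    return data
```

Because the failure branch returns before any memory access, neither the requester's cache nor the state of the other cache devices is touched when the EX line or the HITM line is asserted.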

[Second Weak Protocol Read]

The following will describe pre-fetch processing by the second weak protocol read processing unit 46 set up in the cache controller 30 shown in FIG. 7. A difference between the second weak protocol read and the first weak protocol read is in the structure of the tag 54 in the cache line 50. The tag 54 has not only the 2-bit data b1 and b2 representing the state of data in the cache array 58 but also a tag representing a weak state (W) indicating that the data is pre-fetched by the second weak protocol read. Therefore, it can be said that the second weak protocol read is a MESIW protocol wherein the (W) state is added to the MESI protocol shown in FIG. 6A. This weak state (W) represents pre-fetch data in a state in which the order of memory transactions has been changed and results of writing are not reflected.

In the processing of the second weak protocol read, weak consistency is adopted as a memory consistency model for keeping consistency of data between memories by software. Weak consistency is a model in which memory transactions between sync messages for synchronization operation may be performed in an arbitrary order, and in which a sync message must always be sent out when synchronization is necessary between processors. Such a weak consistency model is described in, for example, “Various definitions of a memory consistency model and an example of a commentary thereon”, pp. 157-158, by Kazuki Joh, in “Parallel Processing Symposium JSPP '97”, May 1997. If this weak consistency model is adopted, data that do not reflect results written in a cache may be present in other cache devices as a result of pre-fetch. However, in the case that a sync message is issued, the synchronization operation is not finished until results written in a certain cache device are reflected in the other cache devices. Using such a weak consistency model, the second weak protocol read causes a pre-fetch request that would result in failure under the first weak protocol read, described above, to finish normally as a successful read.

Let us suppose that a memory access request is issued, for example, from the processor 12-1 to the cache device 10-1 in the multiprocessor system shown in FIG. 4. Concerning the data request from the processor 12-1, a request message corresponding to the data-requested address ADR is sent out onto the common bus 14, in accordance with normal read, by the read protocol processing unit 40 set up in the cache controller 30 shown in FIG. 7. Since pre-fetch is performed by the hardware in the cache devices of the present invention, a pre-fetch request for data in the requested addresses ADR+n is sent in response to the data-requested address ADR. This pre-fetch request is sent onto the common bus 14 as a request message for read based on the weak protocol by the second weak protocol read processing unit 46 set up in the cache controller 30 shown in FIG. 7. When the request message in accordance with the second weak protocol is issued from the cache controller 30, the other cache devices start snoop operation in the same manner as in normal read. Specifically, each of the other cache devices examines the required addresses ADR+n, and examines whether the data corresponding to the required addresses are present in its own cache memory and, if present, what state the data are in. In the case that the corresponding data are present in the cache memory and the data are exclusive (E), the cache controller asserts the EX line. When the data are shared (S), the cache controller asserts the HIT line. When the data are modified (M), the cache controller asserts the HITM line. When the data are invalid (I), the cache controller does not assert any state controlling line. In this manner, all of the cache devices and the memory controller 16 are informed of the data-preservation state of the data in the addresses ADR+n requested for the pre-fetch. The cache controller 30 of the request source and the memory controller 16 for the main memory 18 monitor the common bus (snoop bus) 14.

In the case that none of the EX line, the HIT line and the HITM line is asserted, that is, the data are invalid (I) in the other cache devices, the pre-fetch request is regarded as a success. As a result, the data are transmitted from the main memory 18 to the cache device of the request source. The cache controller 30 of the request source stores the transmitted data, in the exclusive state (E), in the cache array. In the case that only the HIT line is asserted (i.e., the data-shared state (S)), the pre-fetch request is also regarded as a success. As a result, the data in the required addresses are transmitted from the main memory 18 to the cache device of the request source. The cache controller 30 of the request source stores the transmitted data, in the data-shared state (S), in the cache array. In the case of the exclusive state (E), wherein the EX line is asserted and a state change would be necessary in normal read, the pre-fetch request is also regarded as a success in the second weak protocol read. As a result, the data in the required addresses are transmitted from the main memory 18 to the cache device of the request source. In this case, the cache controller 30 of the request source stores the transmitted data, in the weak state (W), in the cache array. In the case of the data-modified state (M), wherein the HITM line is asserted, the cache controller 30 of the request source and the memory controller 16 for the main memory 18 perform any one of the following I, II or III:

I. in accordance with the second weak protocol read, the pre-fetch request is regarded as a success so that the data are transmitted from the main memory 18 to the cache device of the request source, and stored in the weak state (W),

II. in the same way as in a normal memory transaction, the readout is caused to succeed by writing-back, or

III. in the same way as in the first weak protocol read, the pre-fetch request is regarded as a failure so that the processing is interrupted.
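The outcomes of the second weak protocol read, including the three alternatives I to III for the data-modified case, can be summarised in a small decision function. The names `OnModified` and `second_weak_state` are hypothetical. The text does not say in which state the data are stored after the write-back of alternative II; the shared state used below is an assumption made by analogy with an ordinary MESI read.

```python
from enum import Enum
from typing import Optional


class OnModified(Enum):
    """Which of alternatives I to III is applied when the HITM line is asserted."""
    WEAK = "I"          # succeed and store in the weak state (W)
    WRITE_BACK = "II"   # succeed through write-back, as in a normal transaction
    FAIL = "III"        # fail, as in the first weak protocol read


def second_weak_state(
    ex: bool,
    hit: bool,
    hitm: bool,
    on_modified: OnModified = OnModified.WEAK,
) -> Optional[str]:
    """State in which the pre-fetched data are stored, or None on failure."""
    if hitm:
        if on_modified is OnModified.WEAK:
            return "W"
        if on_modified is OnModified.WRITE_BACK:
            return "S"   # assumption: shared after write-back, not stated in the text
        return None      # alternative III: processing is interrupted
    if ex:
        return "W"       # exclusive elsewhere: read without changing the other tag
    if hit:
        return "S"       # shared elsewhere
    return "E"           # invalid everywhere else
```

Selecting the alternative through a parameter mirrors the text, which leaves the choice among I, II and III open.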

FIG. 12 is a flowchart of pre-fetch processing by the second weak protocol read. In the case that pre-fetch of data in addresses ADR+n is requested through the hardware for a requested address ADR, accompanying a memory request from the processor 12, in step S1 the processor 12 first searches its own cache memory 32 and examines whether or not the data in the requested addresses ADR+n are present in the cache array. When the data are present in the cache array and hit in step S2, in step S9 the read data are given back to the processor 12 and the processing is finished. When the data in the requested addresses are not present in the cache array, in step S3 a request message based on the second weak protocol is sent onto the common bus 14. In response to this request message, each of the other cache devices examines whether the data in the requested addresses ADR+n are stored and, if stored, the state of the data. The state controlling line corresponding to the result is then asserted. In the case of the exclusive state (E), wherein the EX line is asserted, or the data-modified state (M), wherein the HITM line is asserted, in step S4, the processing goes to step S5, so that the data in the requested addresses ADR+n are read out from the main memory 18, transmitted to the cache array, and stored therein with the bit of the weak state (W) set. If neither the EX line nor the HITM line is asserted in step S4, the processing goes to step S6, where it is checked whether or not the HIT line is asserted. In the case of the data-shared state (S), wherein the HIT line is asserted, the processing goes to step S7, so that the data in the requested addresses ADR+n transmitted from the main memory 18 are stored, in the data-shared state (S), in the cache array. In the case of the invalid state (I), wherein the HIT line is not asserted in step S6, the processing goes to step S8, so that the data in the requested addresses ADR+n are transmitted to the cache array and stored therein in the exclusive state (E).

When a synchronization message is issued from any one of the processors while the pre-fetch processing in accordance with the above-mentioned second weak protocol read is being performed, the cache controllers of the respective cache devices stop accepting any new transaction until all of the transactions that are being processed finish. At the same time, a sync message is sent out onto the common bus 14 to inform the cache controllers in all the cache devices of the synchronization processing. The cache controller 30 that received the sync message examines the tags 54 of the cache array 58 in its own cache memory 32 to search for data in the weak state (W). All data in the weak state (W) found by this search are changed into the invalid state (I) to be invalidated. The synchronization processing based on the synchronization message from the processor is completed when the invalidation of the data in the weak state has ended in all the cache devices.

In the pre-fetch processing based on the second weak protocol read as well, when data held in the exclusive state (E) or the data-modified state (M) in another cache device are pre-fetched from the main memory in response to a pre-fetch request, it is possible to prevent a change in the states of the other cache devices. The pre-fetched data can be forcibly invalidated, regardless of the change in the states of the other cache devices, by the synchronization operation in the memory consistency model. In this manner, it is possible to reduce, without invalidating data newly, overhead at the time of writing data (i.e., the overhead incurred when a processor writes to a cache device in which data is stored in the exclusive state (E) or the data-modified state (M)).
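The invalidation performed on receipt of a sync message can be sketched as a scan over the tags. The dictionary of address-to-state tags and the function name `handle_sync` are hypothetical; draining of in-flight transactions and the bus message itself are outside this sketch.

```python
from typing import Dict


def handle_sync(tags: Dict[int, str]) -> int:
    """Invalidate every line in the weak state (W); return how many were invalidated.

    `tags` maps an address to one of "M", "E", "S", "I" or "W" and is modified
    in place. Lines in any other state are left untouched.
    """
    weak = [address for address, state in tags.items() if state == "W"]
    for address in weak:
        tags[address] = "I"
    return len(weak)
```

For example, a cache holding tags `{0x100: "W", 0x140: "E", 0x180: "W"}` ends the synchronization with both weak lines invalid and the exclusive line unchanged.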

[Passive Preservation Mode]

The following will describe a passive preservation mode, for preserving pre-fetched data in the cache, by the passive reading mode processing unit 48 set up in the cache controller 30 in FIG. 7. This passive preservation mode is a protocol for invalidating pre-fetched data when data at the same address as the pre-fetched data preserved in the cache is requested by some other cache device.

FIG. 13 illustrates a cache line 50 to which the passive preservation mode is applied. A preservation mode distinguishing tag 60 is newly provided therein. In this preservation mode distinguishing tag 60, the following is set up: a symbol representing an (N) mode (normal mode) indicating normal preservation, or a symbol representing a (P) mode (passive mode) indicating the passive preservation of the present invention. The state of data based on the 2-bit data b1 and b2 in the tag 54 is any one of the MESI states shown in FIG. 9. In this case, the data-modified state (M) does not occur when data is preserved in the (P) mode, which is the passive preservation manner. Specifically, in the same way as in the state transitions of the MESI protocol shown in FIG. 6B, transition to the data-modified state (M) is caused when, in the exclusive state (E) or the data-shared state (S) in the (P) mode, the processor 12 performs renewal by data-writing. In this case, an invalidating message is sent out onto the common bus to invalidate data stored in the other cache devices. Transition to the data-modified state (M) is then attained. At this time, therefore, the data is no longer pre-fetched data, so that the (P) mode is changed to the (N) mode. Accordingly, no data in the (P) mode is in the data-modified state (M).

On the other hand, for mode control in the case that data is in the (P) mode when the data is read from any one of the cache arrays, the following methods can be adopted:

I. the normal read mode is given to change the (P) mode to the (N) mode,

II. a reading request from the processor is finished without changing the (P) mode, and

III. a reading request from the processor is finished without changing the (P) mode, and subsequently the normal read mode is given to change the data from the (P) mode to the (N) mode.

In the present embodiment, the above-mentioned method II, which is the simplest method, is adopted. For data stored in the (P) mode, no result of examination of the state of the cache is outputted in response to a request message sent from some other cache device to the common bus, and the data stored in the (P) mode does not change regardless of whether the data is in the exclusive state (E) or the data-shared state (S). Therefore, the state of the data stored in the (P) mode is one of the invalid state (I), the exclusive state (E) and the data-shared state (S). In each of the cache memories 32, each cache line can store data read by normal data-readout in the (N) mode or data read by pre-fetch in the (P) mode, which is the passive preservation manner.

In the case that data is stored in the (P) mode in any one of the cache memories, the cache controller 30 that stores the data in the (P) mode carries out the following operation through the function of the passive reading mode processing unit 48 shown in FIG. 7 when a request message based on the normal read or the weak protocol read is sent out onto the common bus by a memory access request from some other cache device. First, the controller waits for the state-examination result based on the request message sent onto the common bus 14. In this case, the cache controller 30 in which the data in the (P) mode is stored does not assert any of the EX line, the HIT line and the HITM line regardless of the state of the cache, so that, observed from the outside, it appears to the other cache controllers to be in the invalid state (I). In the case that none of the EX line, the HIT line and the HITM line is asserted on the basis of the result of state-examination in the other cache devices, that is, in the case that the cache controllers other than the one storing the data in the (P) mode are invalid (I), this controller 30 changes the state of the data in the (P) mode to the invalid state (I) in order to invalidate its own data. In this manner, the other cache controller which issued the read request can read the data in the requested address from the main memory 18 and store it in the exclusive state (E). In the case that the EX line or the HITM line is asserted and the data is stored, in the exclusive state (E) or the data-modified state (M), in any one of the other cache devices, the cache controller 30 storing the data in the (P) mode likewise changes the state of the data in the (P) mode to the invalid state (I). On the other hand, in the case that the other cache controllers are in the data-shared state (S), wherein the HIT line is asserted, on the basis of the result of the state-examination of the other controllers, the cache controller 30 storing the data in the (P) mode does not need to change its own data.

FIG. 14 is a flowchart of operation of the cache device in which data in a requested address is stored in the (P) mode when a request message based on read from some other cache device is sent out onto the common bus. In step S1, a request message based on read from some other cache device is sent out onto the common bus 14. As a result, in step S2 the cache device checks whether or not it stores data in the address requested by the request message as a copy in the (P) mode (passive preservation). If the cache device stores the data in the requested address in the (P) mode, it is checked in step S3 whether or not the HIT line is asserted. If the HIT line is not asserted, that is, if all of the other cache devices are in the invalid state (I) or any one thereof is in the exclusive state (E) or the data-modified state (M), the state of the data in the (P) mode is made invalid (I) in step S4. In the case of the data-shared state (S), wherein the HIT line is asserted in step S3, the invalidation in step S4 is skipped and the processing is finished, since it is unnecessary to change the state of the data stored in the (P) mode. By putting data pre-fetched into any one of the caches into the (P) mode, which is the passive preservation mode, as described above, useless sharing of pre-fetched data caused by reads from other cache devices can be avoided as much as possible, and overhead at the time of writing specified data in the data-shared state can be reduced.
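The behaviour of FIG. 14 can be sketched from the point of view of the cache device that holds the requested block. The `Line` class and `snoop_passive` function are hypothetical names, and `hit_by_others` stands for the HIT line as driven by the remaining cache devices. What happens to the mode tag after invalidation is not stated in the text, so the sketch leaves it unchanged.

```python
from dataclasses import dataclass


@dataclass
class Line:
    state: str        # "M", "E", "S" or "I" (tag 54)
    mode: str = "N"   # "N" normal or "P" passive (preservation mode distinguishing tag 60)


def snoop_passive(line: Line, hit_by_others: bool) -> bool:
    """Handle a read request from another cache device for this line's address.

    Returns True when the line was invalidated. A line in the (P) mode never
    asserts EX, HIT or HITM itself, so only the other devices' HIT line matters.
    """
    # S2: only valid copies kept in the (P) mode are handled here.
    if line.mode != "P" or line.state == "I":
        return False
    # S3: HIT asserted by the others means the data are shared; keep the copy.
    if hit_by_others:
        return False
    # S4: others are all invalid, or one holds the data as E or M: drop the copy.
    line.state = "I"
    return True
```

A line in the (N) mode is ignored by this function because it follows the ordinary MESI snoop handling instead.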

[Integration of a Request and a Reply]

FIG. 15 illustrates an embodiment of a cache device and a common bus for putting a normal memory request, a pre-fetch request and a reply together and transmitting them when the two requests overlap with each other. A common bus 14 has two systems of state controlling lines for sending out the results of examining the states of cache devices. One is a normal control system line 62, and the other is a pre-fetch control system line 64. Specifically, the normal control system line 62 has an EX line 20, a HIT line 22 and a HITM line 24 in the same manner as shown in FIG. 5. The pre-fetch control system line 64 has an EX line 66, a HIT line 68 and a HITM line 70. Of course, the common bus 14 has not only the normal control system line 62 and the pre-fetch control system line 64 but also an address bus 26, a data bus 28 and a command bus 29. When a cache controller of the cache device 10 receives a reply message based on normal read through the common bus 14, the controller sends out the result obtained by examining the state of data in a requested address to the normal control system line 62. When the cache controller receives a reply message in response to a pre-fetch request, the controller sends the state of data in a requested address to the pre-fetch control system line 64. As shown by the cache line in FIG. 13, data is preserved in the cache device 10 with the (P) mode, which is the passive preservation manner, set in the preservation mode distinguishing tag 60. The cache controller in the cache device 10 using the normal control system line 62 and the pre-fetch control system line 64 puts a normal read request and a pre-fetch request together into one request message in response to a data read request from the processor.

When a data request is sent out from the processor 12 in the cache device 10 shown in FIG. 5, a request message in accordance with a normal data request and a request message for a data request for pre-fetch using the weak protocol read are successively and separately sent out to the common bus 14. The embodiment shown in FIG. 15 has a function that, when a normal data request from the processor is received and then a request message for a data request is sent out to the common bus 14, the memory controller 16 for the main memory 18 and the other cache devices interpret the message as including not only the normal data request but also a data request based on the weak protocol for pre-fetch. In other words, the memory controller 16 for the main memory and the other cache devices interpret a request message which requests data in an address ADR, sent out from the cache device of the request source to the common bus 14, as serving both as the normal data request and as a pre-fetch request for data in addresses ADR+n. In this manner, it is unnecessary to send out the request message for the normal read request and the accompanying request message for pre-fetch based on the weak protocol read separately from the cache device of the request source to the common bus 14. Thus, overhead of excessive requests can be reduced.

In the case that any one of the cache devices already has, in its own cache memory, the data to be pre-fetched together with a normal data request, the cache device does not need to send out a data request for pre-fetch based on the weak protocol to the common bus. Thus, a request message having a request format 72 shown in FIG. 16 is sent out. This request format 72 has a request source ID 74, a command 76, a normal data validity bit 78, a pre-fetch data validity bit 80 and an address 82 in this order. Thus, as shown on the request side in FIG. 18, any one of a request message for requesting normal data and pre-fetch data, a request message for requesting only normal data and a request message for requesting only pre-fetch data can be selectively issued by controlling a bit D in the normal data validity bit 78 or a bit B in the pre-fetch data validity bit 80. Because the normal control system line 62 and the pre-fetch control system line 64 are set up in connection with such a request message, which makes it possible to request the normal data and the pre-fetch data together, any one of the cache devices can inform the other cache devices, at one time, of both results obtained by examining the states of the normal data and the pre-fetch data in response to a normal data request and a pre-fetch data request issued in one request message.

In this embodiment, a reply format 84 for giving back data at one time in response to a request message that requests normal data and pre-fetch data at the same time is adopted, as shown in FIG. 17. The reply format 84 is composed of a header section 86 and a data section 88. The header section 86 has a request source ID 90, a command 92, a normal data validity bit 95, a pre-fetch data validity bit 96, and an address 98. If the bit D in the normal data validity bit 95 is valid, data 0 read in response to a normal data request is added to the rear of the header section 86. If the bit B in the pre-fetch data validity bit 96 is valid, data 1 to n for pre-fetch, which correspond to n blocks decided beforehand, are arranged in the rear of the data 0. The relationship between the reply format 84 and the request format 72 is as shown in FIG. 18. Specifically, when the request message is “normal+pre-fetch”, the reply message is “normal+pre-fetch” or “normal” only. When the request message is “normal” only, the reply message is also “normal” only. When the request message is “pre-fetch” only, the reply message is also “pre-fetch” only. Even if a normal memory request and a pre-fetch request overlap with each other, the two requests can be transmitted together by putting the two requests into one unit, putting the replies into one unit in the same way, and further setting up two systems of state controlling lines (snoop buses) for conveying results obtained by examining the states of normal data and pre-fetch data in the cache devices. Thus, overhead based on an increase in data transmission can be made as small as possible.
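The layout of the reply and its permitted pairing with the request can be sketched as follows. The header fields and the order of the data section (data 0 followed by data 1 to n) follow the description of reply format 84, and the table of permitted pairs follows the relationship described for FIG. 18. The names `ReplyMessage`, `message_kind`, `build_reply` and `ALLOWED_REPLIES`, the command string, and the use of `bytes` for a block are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Request kind -> reply kinds permitted by the relationship described for FIG. 18.
ALLOWED_REPLIES = {
    "normal+pre-fetch": {"normal+pre-fetch", "normal"},
    "normal": {"normal"},
    "pre-fetch": {"pre-fetch"},
}


def message_kind(normal_valid: bool, prefetch_valid: bool) -> str:
    """Classify a request or reply by its two validity bits."""
    if normal_valid and prefetch_valid:
        return "normal+pre-fetch"
    if normal_valid:
        return "normal"
    if prefetch_valid:
        return "pre-fetch"
    raise ValueError("at least one validity bit must be set")


@dataclass
class ReplyMessage:
    source_id: int        # request source ID 90
    command: str          # command 92 (string encoding is an assumption)
    normal_valid: bool    # bit D of the normal data validity bit 95
    prefetch_valid: bool  # bit B of the pre-fetch data validity bit 96
    address: int          # address 98
    data: List[bytes] = field(default_factory=list)  # data section 88


def build_reply(request_kind: str, source_id: int, address: int,
                normal_block: Optional[bytes],
                prefetch_blocks: Optional[List[bytes]]) -> ReplyMessage:
    """Assemble one reply: header, then data 0, then data 1 to n."""
    blocks = list(prefetch_blocks or [])
    reply = ReplyMessage(
        source_id=source_id,
        command="REPLY",
        normal_valid=normal_block is not None,
        prefetch_valid=bool(blocks),
        address=address,
    )
    if normal_block is not None:
        reply.data.append(normal_block)  # data 0 directly follows the header
    reply.data.extend(blocks)            # data 1..n follow data 0
    kind = message_kind(reply.normal_valid, reply.prefetch_valid)
    if kind not in ALLOWED_REPLIES[request_kind]:
        raise ValueError(f"reply '{kind}' not permitted for request '{request_kind}'")
    return reply
```

In this sketch a "normal+pre-fetch" request whose pre-fetch part fails under the weak protocol is answered with a "normal" only reply by passing `prefetch_blocks=None`.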

In the above-mentioned embodiments, a snoop bus is used as an example of the common bus that connects cache devices of a multiprocessor. However, any appropriate common bus serving as an interconnecting network and having an equivalent function may be used. The above-mentioned embodiments are examples of data consistency management based on the MESI protocol, but another cache coherence protocol may be used as well. The present invention includes appropriate modifications so far as they damage neither the object nor the advantages thereof. Furthermore, the present invention is not limited by the numbers described in the embodiments.

INDUSTRIAL APPLICABILITY

As described above, according to the cache device of the present invention, by using the weak read operation for pre-fetch in a multiprocessor system, useless sharing of pre-fetch data and normal data can be avoided as much as possible. Thus, overhead of writing on the cache device can be made small.

Furthermore, by embedding a pre-fetch request accompanying a normal read request in the normal request and uniting the reply to the normal read with the reply to the pre-fetch, overhead of the data transmission resulting from an increase in pre-fetch can be suppressed.

CLAIMS

1. A cache device set up in each of processors, interconnected to other cache devices in other processors and connected to a main memory, which comprises: a cache memory wherein a part of data in the main memory is stored in one or more cache lines and a state tag used to manage data consistency is set up in each of the cache lines, and a cache controller performing, as a pre-fetch protocol, a weak read operation that fails a pre-fetch request following a read request from one of the processors, if at a time of generation of the pre-fetch request, the state tags of other cache devices must be changed to read the data stored in the other cache devices.
2. The cache device according to claim 1, wherein said cache memory distinguishes the stored data by a data-modified state (M), an exclusive state (E), a data-shared state (S) and an invalid state (I), each of which indicates validity of the state tag, and said cache controller causes failure in said pre-fetch request when the data corresponding to the pre-fetch request stored in the other cache devices is in the data-modified state (M) or the exclusive state (E).
3. The cache device according to claim 1, wherein said cache controller reads, when the data corresponding to the pre-fetch request and stored in the other cache devices is in the invalid state (I), the same data from said main memory and stores the same data in the exclusive state (E) in the cache memory; and when the data is in the data-shared state (S), the cache controller reads the data from the other cache devices and stores the data in the data-shared state (S) in the cache memory.
4. The cache device according to claim 1, wherein said cache memory distinguishes the stored data by a data-modified state (M), an exclusive state (E), a data-shared state (S) and an invalid state (I), each of which indicates validity of the state tag, and said cache controller reads, when the data which corresponds to the pre-fetch request and is stored in the other cache devices is in the data-modified state (M) or the exclusive state (E), the data without changing the state tag and stores the data in the cache memory with the setup of the weak state (W), and at the time of synchronization operation of the memory consistency the cache controller changes the weak state (W) into the invalid state (I) wholly.
5. The cache device according to claim 4, wherein said cache controller reads, when the data corresponding to the pre-fetch request and stored in the other cache devices is in the invalid state (I), the same data from said main memory and stores the same data in the exclusive state (E) in the cache memory; and when the data is in the data-shared state (S), the cache controller reads the data from the other cache devices and stores the data in the data-shared state (S) in the cache memory.
6. The cache device according to claim 1, wherein said cache controller carries out, when the cache controller receives the read request from said processor, a pre-fetch request for pre-fetching data in one or more addresses adjacent to a read-requested address after said read request.
7. The cache device according to claim 6, wherein the cache device is interconnected to the other cache devices via a snoop bus for outputting, when said cache controller receives a read request from its own processor or some other cache devices, preservation states of the corresponding data into state controlling lines, wherein the state controlling lines comprise a first state controlling line that exclusively corresponds to the read request and a second state controlling line that exclusively corresponds to the pre-fetch request, wherein the cache controller carries out the read request and the pre-fetch request at the same time, and carries out, via the first and second state controlling lines, the states of the respective cache devices about an address of the requested data and an address of the pre-fetch requested data at the same time.
8. The cache device according to claim 6, wherein in response to the simultaneous requests of said read request and the pre-fetch request, a distinguishing bit for distinguishing data in response to said read request and data in response to the pre-fetch request is fitted up to a response header, and data making the distinguishing bit valid are transmitted in a lump.
9. A cache device set up in each of processors, interconnected to other cache devices in other processors and connected to a main memory, which comprises: a cache memory wherein a part of data in the main memory is stored in one or more cache lines and a state tag used to manage data consistency is set up in each of the cache lines, and a cache controller in response to a pre-fetch request following a read request from one of the processors, reading data without changing state tags of other cache devices and storing the read data in the cache memory with setup of a weak state (W), if at a time of generation of the pre-fetch request, the state tags of the other cache devices must be changed to read the data stored in the other cache devices, and invalidating the data stored in the cache memory in the weak state (W) at a time of synchronization operation of memory consistency to attain data-consistency by software.
10. The cache device according to claim 9, wherein said cache controller carries out, when the cache controller receives the read request from said processor, a pre-fetch request for pre-fetching data in one or more addresses adjacent to a read-requested address after said read request.
11. A cache device set up in each of processors, interconnected to other cache devices in other processors and connected to a main memory, which comprises: a cache memory wherein a part of data in the main memory is stored in one or more cache lines and a state tag used to manage data consistency is set up in each of the cache lines, and a cache controller controlling a pre-fetch protocol according to a process comprising: setting as a state tag, at a time of generation of a pre-fetch request following a read request from one of the processors, a passive preservation mode P to data pre-fetched from other cache devices or from the main memory, storing the pre-fetched data in said cache memory, not informing the other cache devices of the preservation of the data in said cache memory, when data for a read request from the other cache devices corresponds to the pre-fetch data to which said passive preservation mode P is set, and invalidating the pre-fetched data in the cache memory, when according to the read request from the other cache devices, none of the other cache devices store the corresponding data, and preserving said pre-fetch data as it is, when according to the read request from the other cache devices, the other cache devices share the corresponding data.
12. The cache device according to claim 11, wherein said cache memory distinguishes the stored data by a data-modified state (M), an exclusive state (E), a data-shared state (S) and an invalid state (I), each of which indicates validity of the state tag, and in the case that the data corresponding to the read request from some other cache device is the pre-fetch data to which said passive preservation mode P is set up, said cache controller changes the pre-fetch data stored in the passive preservation mode P into the invalid state (I) when all of the other cache devices are in the invalid state (I), or either one of the other cache devices is in the data-modified state (M) or the exclusive state (E), and the cache device keeps the state of the pre-fetch data stored in the passive preservation mode P as it is when the other cache devices are in the data-shared state (S).
13. The cache device according to claim 11, wherein a normal preservation mode N is set up to data other than the pre-fetch data in the passive preservation mode P stored in said cache memory, and data-preservation in the passive preservation mode P and data-preservation in the normal preservation mode N are carried out in the respective cache lines, and caused to exist together.
14. A method of controlling a cache system wherein cache devices set up in respective processors are mutually connected through an interconnecting network and are connected to a main memory, the method comprising: storing a part of data in the main memory in one or more cache lines on cache memory and setting up a state tag to manage data consistency in each of the cache lines, and performing, as a pre-fetch protocol, a weak read operation that fails a pre-fetch request following a read request from one of the processors, if at a time of generation of the pre-fetch request, the state tags of other cache devices must be changed to read the data stored in the other cache devices.
15. A method of controlling a cache system wherein cache devices set up in respective processors are mutually connected through an interconnecting network and are connected to a main memory, the method comprising: storing a part of data in the main memory in one or more cache lines on cache memory and setting up a state tag to manage data consistency in each of the cache lines, in response to a pre-fetch request following a read request from one of the processors, reading data without changing state tags of other cache devices and storing the read data in the cache memory with setup of a weak state (W), if at a time of generation of the pre-fetch request, the state tags of the other cache devices must be changed to read the data stored in the other cache devices, and invalidating the data stored in the cache memory in the weak state (W) at a time of synchronization operation of memory consistency to attain data-consistency by software.
16. A method of controlling a cache system wherein cache devices set up in respective processors are mutually connected through an interconnecting network and are connected to a main memory, the method comprising: storing a part of data in the main memory in one or more cache lines on cache memory and setting up a state tag to manage data consistency in each of the cache lines, setting as a state tag, at a time of generation of a pre-fetch request following a read request from one of the processors, a passive preservation mode P to data pre-fetched from other cache devices or the main memory and storing the pre-fetched data in said cache memory, not informing the other cache devices of the preservation of the data in said cache memory, when data for a read request from the other cache devices corresponds to the pre-fetch data to which said passive preservation mode P is set, and invalidating said pre-fetch data when, according to the read request from the other cache devices, none of the cache devices store the corresponding data, and storing said pre-fetch data as it is when, according to the read request from the other cache devices, the corresponding data is shared by the other cache devices.
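
For illustration only, and without limiting the claims, the state-dependent handling of a pre-fetch request recited in claims 2 to 5 can be summarized in a minimal sketch. The names `State`, `weak_prefetch` and `synchronize` and the flag `use_weak_state` are hypothetical. Treating any state other than M, E or S in the other cache devices as if it were the invalid state (I) is an assumption of this sketch, since the claims do not address that case.

```python
from enum import Enum
from typing import Dict, Iterable, Optional


class State(Enum):
    M = "data-modified"
    E = "exclusive"
    S = "data-shared"
    I = "invalid"
    W = "weak"


def weak_prefetch(others: Iterable[State], use_weak_state: bool) -> Optional[State]:
    """Return the state in which pre-fetched data is stored, or None on failure.

    `others` holds the states of the corresponding data in the other cache
    devices at the time the pre-fetch request is generated.
    """
    others = list(others)
    if any(s in (State.M, State.E) for s in others):
        # Reading normally would require changing another device's state tag.
        # Claim 2: the pre-fetch request fails.
        # Claim 4: the data is read without changing the tag and stored as W.
        return State.W if use_weak_state else None
    if any(s is State.S for s in others):
        # Claims 3 and 5: read from the other cache devices, store as S.
        return State.S
    # Claims 3 and 5: all copies invalid, so read from main memory, store as E.
    # Assumption: states not covered by the claims are treated like I here.
    return State.E


def synchronize(lines: Dict[int, State]) -> Dict[int, State]:
    """Claim 4: at a synchronization operation every W line becomes I."""
    return {addr: (State.I if s is State.W else s) for addr, s in lines.items()}
```

In this sketch, `weak_prefetch([State.I, State.M], use_weak_state=False)` returns `None`, corresponding to a failed pre-fetch request, while the same call with `use_weak_state=True` returns `State.W`.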